
Novel Benchmark for NER in the Wastewater and Stormwater Domain

Cardillo, Franco Alberto, Debole, Franca, Frontini, Francesca, Aelami, Mitra, Chahinian, Nanée, Conrad, Serge

arXiv.org Artificial Intelligence

The effective management of wastewater and stormwater systems is crucial for urban sustainability and environmental protection. These systems, which form an integral part of public infrastructure, require structured information for monitoring, planning, and maintenance. However, much of the relevant information exists in unstructured textual formats, such as technical reports, regulatory documents, and maintenance logs. Extracting information from these sources is a key challenge due to domain-specific terminology and the multilingual nature of regulatory and operational contexts. Typically, a wastewater management information extraction application will require domain-specific entity recognition, followed by the extraction of relations between entities to support decision-making, automated reasoning, and linking to existing knowledge bases. Recent progress in domain-specific Named Entity Recognition (NER) has the potential to greatly facilitate the development of such applications. However, to effectively evaluate this first and crucial step of the extraction pipeline, it is essential to establish a clearly defined set of extractable entities and construct a multilingual benchmark corpus. Building on previous work, carried out within the framework of a national project on just one language, we propose the following contributions: the starwars corpus, an aligned French-Italian corpus containing domain-specific texts.
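Evaluating the NER step described above typically means comparing predicted entity spans against gold annotations. The sketch below is illustrative, not from the paper: the entity types (`INFRA`, `EVENT`) and the BIO tag sequences are hypothetical, and span-level F1 is shown as one common way such a benchmark might be scored.

```python
# Hypothetical sketch: span-level evaluation of domain NER predictions.
# Entity types (INFRA, EVENT) are invented for illustration.

def spans_from_bio(tags):
    """Convert a BIO tag sequence into a set of (start, end, label) spans."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        if tag.startswith("B-") or tag == "O":
            if start is not None:
                spans.add((start, i, label))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans

def span_f1(gold_tags, pred_tags):
    """Exact-match precision/recall/F1 over entity spans."""
    g, p = spans_from_bio(gold_tags), spans_from_bio(pred_tags)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# "storm drain" (INFRA) found; "overflow" (EVENT) missed by the model.
gold = ["B-INFRA", "I-INFRA", "O", "B-EVENT", "O"]
pred = ["B-INFRA", "I-INFRA", "O", "O", "O"]
print(round(span_f1(gold, pred), 3))  # → 0.667
```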


AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games

Neural Information Processing Systems

Decision-making in large-scale games is an essential research area in artificial intelligence (AI) with significant real-world impact. However, the limited access to realistic large-scale game environments has hindered research progress in this area. In this paper, we present AuctionNet, a benchmark for bid decision-making in large-scale ad auctions derived from a real-world online advertising platform. AuctionNet is composed of three parts: an ad auction environment, a pre-generated dataset based on the environment, and performance evaluations of several baseline bid decision-making algorithms. More specifically, the environment effectively replicates the integrity and complexity of real-world ad auctions through the interaction of several modules: the ad opportunity generation module employs deep generative networks to bridge the gap between simulated and real-world data while mitigating the risk of sensitive data exposure; the bidding module implements diverse auto-bidding agents trained with different decision-making algorithms; and the auction module is anchored in the classic Generalized Second Price (GSP) auction but also allows for customization of auction mechanisms as needed.
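The Generalized Second Price rule that anchors the auction module can be stated compactly: bidders are ranked by bid, and the winner of each slot pays the next-highest bid. The sketch below shows only this textbook pricing rule with made-up bids; it is not AuctionNet's implementation, which also supports customized mechanisms.

```python
# Minimal sketch of the classic Generalized Second Price (GSP) rule.
# Bidder names, bids, and slot count are illustrative.

def gsp_allocate(bids, num_slots):
    """bids: {bidder: bid}. Returns [(bidder, price)] per slot, ranked by
    bid; each winner pays the next-highest bid (0.0 if there is none)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for slot in range(min(num_slots, len(ranked))):
        bidder, _ = ranked[slot]
        price = ranked[slot + 1][1] if slot + 1 < len(ranked) else 0.0
        results.append((bidder, price))
    return results

print(gsp_allocate({"a": 5.0, "b": 3.0, "c": 1.0}, num_slots=2))
# → [('a', 3.0), ('b', 1.0)]: a pays b's bid, b pays c's bid
```

A key property of GSP, and part of what makes auto-bidding a hard decision problem, is that unlike a single-item second-price auction it is not truthful: shading one's bid can change slot assignment and payment simultaneously.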


From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions

Retkowski, Fabian, Waibel, Alexander

arXiv.org Artificial Intelligence

Text segmentation is a fundamental task in natural language processing, where documents are split into contiguous sections. However, prior research in this area has been constrained by limited datasets, which are either small in scale, synthesized, or only contain well-structured documents. In this paper, we address these limitations by introducing a novel benchmark, YTSeg, focusing on spoken content that is inherently more unstructured and both topically and structurally diverse. As part of this work, we introduce an efficient hierarchical segmentation model, MiniSeg, that outperforms state-of-the-art baselines. Lastly, we expand the notion of text segmentation to a more practical "smart chaptering" task that involves the segmentation of unstructured content, the generation of meaningful segment titles, and a potential real-time application of the models.
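Segmentation benchmarks like the one above are commonly scored with the P_k metric, though the abstract does not name the metric used. The sketch below shows the standard P_k computation on invented segment labelings: slide a window of size k and count how often reference and hypothesis disagree about whether two positions fall in the same segment.

```python
# Sketch of the standard P_k text-segmentation metric (illustrative;
# the abstract does not state which metric the benchmark uses).

def pk(reference, hypothesis, k):
    """reference/hypothesis: one segment label per sentence, e.g. [0,0,1,1].
    Returns the fraction of windows of size k where the two labelings
    disagree on whether positions i and i+k share a segment."""
    n = len(reference)
    errors = 0
    for i in range(n - k):
        same_ref = reference[i] == reference[i + k]
        same_hyp = hypothesis[i] == hypothesis[i + k]
        errors += same_ref != same_hyp
    return errors / (n - k)

ref = [0, 0, 0, 1, 1, 1]  # true chapter boundary after sentence 3
hyp = [0, 0, 1, 1, 1, 1]  # predicted boundary one sentence early
print(pk(ref, hyp, k=2))  # → 0.5
```

Lower is better: a perfect segmentation scores 0.0, and placing a boundary one position off is penalized only for the windows that straddle the error.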


CoheSentia: A Novel Benchmark of Incremental versus Holistic Assessment of Coherence in Generated Texts

Maimon, Aviya, Tsarfaty, Reut

arXiv.org Artificial Intelligence

Coherence is a linguistic term that refers to the relations between small textual units (sentences, propositions), which make the text logically consistent and meaningful to the reader. With the advances of generative foundational models in NLP, there is a pressing need to automatically assess the human-perceived coherence of automatically generated texts. Up until now, little work has been done on explicitly assessing the coherence of generated texts and analyzing the factors contributing to (in)coherence. Previous work on the topic used other tasks, e.g., sentence reordering, as proxies of coherence, rather than approaching coherence detection head-on. In this paper, we introduce CoheSentia, a novel benchmark of human-perceived coherence of automatically generated texts. Our annotation protocol reflects two perspectives: one is global, assigning a single coherence score, and the other is incremental, scoring sentence by sentence. The incremental method produces an (in)coherence score for each text fragment and also pinpoints reasons for incoherence at that point. Our benchmark contains 500 automatically generated and human-annotated paragraphs, each annotated with both methods, by multiple raters. Our analysis shows that the inter-annotator agreement in the incremental mode is higher than in the holistic alternative, and our experiments show that standard LMs fine-tuned for coherence detection show varied performance on the different factors contributing to (in)coherence. All in all, these models yield unsatisfactory performance, emphasizing the need for developing more reliable methods for coherence assessment.
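The contrast between the two annotation modes can be made concrete with a small data sketch. Everything below is illustrative: the record layout, scores, reasons, and mean aggregation are assumptions, not the paper's protocol.

```python
# Hedged sketch of the two annotation perspectives: holistic assigns one
# paragraph-level score, while incremental scores sentence by sentence and
# records a reason wherever incoherence is judged to arise. The mean
# aggregation is illustrative only.

from statistics import mean

incremental = [
    {"sentence": 1, "score": 5, "reason": None},
    {"sentence": 2, "score": 4, "reason": None},
    {"sentence": 3, "score": 2, "reason": "contradicts sentence 1"},
]

def aggregate(annotations):
    """Collapse incremental scores into one holistic-style score and
    collect the pinpointed incoherence reasons."""
    score = mean(a["score"] for a in annotations)
    reasons = [(a["sentence"], a["reason"]) for a in annotations if a["reason"]]
    return score, reasons

score, reasons = aggregate(incremental)
print(round(score, 2), reasons)
```

The incremental record keeps exactly the information the holistic score discards: which sentence broke coherence and why, which is what the abstract credits for the higher inter-annotator agreement.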


UIT-HWDB: Using Transferring Method to Construct A Novel Benchmark for Evaluating Unconstrained Handwriting Image Recognition in Vietnamese

Nguyen, Nghia Hieu, Vo, Duong T. D., Van Nguyen, Kiet

arXiv.org Artificial Intelligence

Recognizing handwriting images is challenging due to the vast variation in writing style across people and the distinct linguistic aspects of the written language. In Vietnamese, besides the modern Latin characters, there are accents and letter marks, together with characters, that confuse state-of-the-art handwriting recognition methods. Moreover, as a low-resource language, Vietnamese has few datasets for handwriting recognition research, which creates a barrier for researchers approaching this task. Recent works evaluated offline handwriting recognition methods in Vietnamese using images from an online handwriting dataset, constructed by connecting pen-stroke coordinates without further processing. This approach cannot effectively measure the ability of recognition methods, as it is trivial and may lack features that are essential in offline handwriting images. Therefore, in this paper, we propose the Transferring method to construct a handwriting image dataset that incorporates the crucial natural attributes required of offline handwriting images. Using our method, we provide the first high-quality synthetic dataset that is complex and natural enough to effectively evaluate handwriting recognition methods. In addition, we conduct experiments with various state-of-the-art methods to identify the challenges that remain for handwriting recognition in Vietnamese.
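The naive baseline the paper criticizes, connecting pen-stroke coordinates without further processing, amounts to rasterizing line segments between consecutive online points. The sketch below shows only that baseline on made-up coordinates; the paper's Transferring method goes further by adding the natural attributes (e.g. realistic stroke appearance) this simple rendering lacks.

```python
# Sketch of the naive online-to-offline conversion: draw straight line
# segments between consecutive pen-stroke coordinates on a pixel grid.
# Coordinates and grid size are illustrative.

def rasterize(strokes, width, height):
    """strokes: list of [(x, y), ...] point sequences. Returns a binary
    pixel grid with segments interpolated between consecutive points."""
    img = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for t in range(steps + 1):
                x = round(x0 + (x1 - x0) * t / steps)
                y = round(y0 + (y1 - y0) * t / steps)
                img[y][x] = 1
    return img

img = rasterize([[(0, 0), (3, 3)]], width=4, height=4)
print(sum(map(sum, img)))  # → 4: a one-pixel-wide diagonal
```

The output makes the paper's objection visible: every stroke is a one-pixel-wide polyline, with none of the pen-width, pressure, or texture variation present in real scanned handwriting.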